An Honest Cross-Validation Estimator for Prediction Performance
Pan, Tianyu, Yu, Vincent Z., Devanarayan, Viswanath, Tian, Lu
Cross-validation is a standard tool for obtaining an honest assessment of the performance of a prediction model. The commonly used version repeatedly splits the data, trains the prediction model on the training set, evaluates the model performance on the test set, and averages the model performance across different data splits. A well-known criticism is that such a cross-validation procedure does not directly estimate the performance of the particular model recommended for future use. In this paper, we propose a new method to estimate the performance of a model trained on a specific (random) training set. A naive estimator can be obtained by applying the model to a disjoint testing set. Surprisingly, cross-validation estimators computed from other random splits can be used to improve this naive estimator within a random-effects model framework. We develop two estimators -- a hierarchical Bayesian estimator and an empirical Bayes estimator -- that perform similarly to or better than both the conventional cross-validation estimator and the naive single-split estimator. Simulations and a real-data example demonstrate the superior performance of the proposed method.
- North America > United States > Indiana (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
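A minimal sketch of the shrinkage idea in this abstract, assuming a simple one-way random-effects model with moment-based variance components: the naive test-set error of the deployed split is pulled toward the mean of cross-validation estimates from other random splits. The Ridge model, split sizes, and variance estimates are illustrative assumptions, not the paper's hierarchical-Bayes or empirical-Bayes construction.

```python
# Schematic empirical-Bayes shrinkage of a single-split error estimate,
# assuming: split-specific true errors vary around a grand mean mu with
# between-split variance tau2; each split estimate has sampling variance
# sigma2. Variance components are estimated crudely by the method of moments.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = X @ rng.normal(size=10) + rng.normal(size=300)

def split_estimate(seed):
    """Test-set MSE of a model trained on one random 80/20 split,
    plus the sampling variance of that mean."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                              random_state=seed)
    losses = (y_te - Ridge().fit(X_tr, y_tr).predict(X_te)) ** 2
    return losses.mean(), losses.var(ddof=1) / len(losses)

# Naive estimator: error of the specific split whose trained model we deploy.
naive, sigma2 = split_estimate(seed=0)

# Cross-validation estimates from K = 20 other random splits.
cv = np.array([split_estimate(seed=s)[0] for s in range(1, 21)])
mu_hat = cv.mean()

# Moment-based between-split variance (sigma2 of the deployed split is used
# as a proxy for all splits' sampling variance -- an illustrative shortcut).
tau2 = max(cv.var(ddof=1) - sigma2, 0.0)

# Empirical-Bayes estimate: shrink the naive estimate toward the CV mean.
w = tau2 / (tau2 + sigma2)
theta_hat = w * naive + (1.0 - w) * mu_hat
print(f"naive={naive:.3f}  cv mean={mu_hat:.3f}  EB={theta_hat:.3f}")
```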
Adversarial Subspace Generation for Outlier Detection in High-Dimensional Data
Cribeiro-Ramallo, Jose, Matteucci, Federico, Enciu, Paul, Jenke, Alexander, Arzamasov, Vadim, Strufe, Thorsten, Böhm, Klemens
Outlier detection in high-dimensional tabular data is challenging since data is often distributed across multiple lower-dimensional subspaces -- a phenomenon known as the Multiple Views effect (MV). This effect led to a large body of research focused on mining such subspaces, known as subspace selection. However, as the precise nature of the MV effect was not well understood, traditional methods had to rely on heuristic-driven search schemes that struggle to accurately capture the true structure of the data. Properly identifying these subspaces is critical for unsupervised tasks such as outlier detection or clustering, where misrepresenting the underlying data structure can hinder performance. We introduce Myopic Subspace Theory (MST), a new theoretical framework that mathematically formulates the Multiple Views effect and casts subspace selection as a stochastic optimization problem. Based on MST, we introduce V-GAN, a generative method trained to solve this optimization problem. This approach avoids any exhaustive search over the feature space while ensuring that the intrinsic data structure is preserved. Experiments on 42 real-world datasets show that using V-GAN subspaces to build ensemble methods leads to a significant increase in one-class classification performance -- compared to existing subspace selection, feature selection, and embedding methods. Further experiments on synthetic data show that V-GAN identifies subspaces more accurately while scaling better than other relevant subspace selection methods. These results confirm the theoretical guarantees of our approach and also highlight its practical viability in high-dimensional settings.
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- (2 more...)
- Research Report > Experimental Study (0.92)
- Research Report > New Finding (0.87)
- Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
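The abstract's pipeline, subspace selection followed by an ensemble of one-class detectors, can be sketched as below, with random feature subspaces standing in for the subspaces V-GAN would learn. The detector choice, subspace count, and subspace dimension are all illustrative assumptions, not the authors' method.

```python
# Schematic subspace-ensemble outlier detection: fit a detector on each
# low-dimensional feature subspace (one "view" of the data) and average the
# scores. Random subspaces below are a stand-in for learned ones.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def subspace_ensemble_scores(X, n_subspaces=20, subspace_dim=3, seed=0):
    """Average outlier scores from detectors fit on random subspaces."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    scores = np.zeros(n)
    for _ in range(n_subspaces):
        cols = rng.choice(d, size=subspace_dim, replace=False)
        lof = LocalOutlierFactor(n_neighbors=20)
        lof.fit(X[:, cols])
        # negative_outlier_factor_ is more negative for stronger outliers
        scores += -lof.negative_outlier_factor_
    return scores / n_subspaces

# Toy data: 200 inliers plus 5 widely scattered outliers in 30 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(200, 30)),
               rng.normal(scale=4.0, size=(5, 30))])
ranking = np.argsort(-subspace_ensemble_scores(X))
print("top-5 suspected outliers:", ranking[:5])
```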
Hyperparameter tuning via trajectory predictions: Stochastic prox-linear methods in matrix sensing
Lou, Mengqi, Verchand, Kabir Aladin, Pananjady, Ashwin
This model finds applications in diverse areas of science and engineering, including astronomy, medical imaging, and communications (Jefferies and Christou, 1993; Wang and Poor, 1998; Campisi and Egiazarian, 2017). For instance, it forms an example of the blind deconvolution problem in statistical signal processing (see, e.g., Recht et al. (2010); Ahmed et al. (2013) and the references therein for several applications of this problem). We are interested in the model-fitting problem, and the natural least-squares population objective $L: \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ (corresponding to the scaled negative log-likelihood of our observations under Gaussian noise) can be written as
$$L(\mu, \nu) = \mathbb{E}\left[\left(y - \langle x, \mu \rangle \langle z, \nu \rangle\right)^2\right], \tag{2}$$
where the conditional distribution of $y$ given $x, z$ is as specified by the model (1). Note that $L$ is a jointly nonconvex function in the parameters $(\mu, \nu)$. With the goal of minimizing the population loss $L$, we consider online algorithms which operate on a mini-batch of size $m$ with $1 \leq m \leq d$, for which we draw a fresh set of observations $\{y_i, x_i, z_i\}_{i=1}^{m}$ at each iteration and form the averaged loss
$$L_m(\mu, \nu) = \frac{1}{m} \sum_{i=1}^{m} \left(y_i - \langle x_i, \mu \rangle \langle z_i, \nu \rangle\right)^2.$$
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
- Workflow (0.68)
- Research Report (0.64)
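The mini-batch objective $L_m$ above lends itself to a compact illustration. The sketch below takes one stochastic prox-linear step per fresh batch: the bilinear residuals $r_i(\mu, \nu) = \langle x_i, \mu \rangle \langle z_i, \nu \rangle - y_i$ are linearized and a ridge-regularized least-squares system is solved. The step size, dimensions, and noise level are illustrative assumptions, not the paper's tuned procedure.

```python
# One stochastic prox-linear step on the mini-batch loss L_m for the model
# y = <x, mu><z, nu> + noise: linearize the residual map r at the current
# iterate and minimize ||r + J d||^2 / m + ||d||^2 / (2 * eta) over d.
import numpy as np

def prox_linear_step(mu, nu, X, Z, y, eta=0.5):
    """Linearize residuals, solve the resulting ridge system for the update."""
    m, d = X.shape
    a, b = X @ mu, Z @ nu                              # <x_i, mu>, <z_i, nu>
    r = a * b - y                                      # residuals r_i(mu, nu)
    J = np.hstack([b[:, None] * X, a[:, None] * Z])    # Jacobian of r
    H = (2.0 / m) * J.T @ J + np.eye(2 * d) / eta
    delta = np.linalg.solve(H, -(2.0 / m) * J.T @ r)
    return mu + delta[:d], nu + delta[d:]

# Toy run: a fresh mini-batch of size m <= d at each iteration, as above.
rng = np.random.default_rng(0)
d, m = 20, 16
mu_star, nu_star = rng.normal(size=d), rng.normal(size=d)
mu, nu = rng.normal(size=d), rng.normal(size=d)
for t in range(200):
    X, Z = rng.normal(size=(m, d)), rng.normal(size=(m, d))
    y = (X @ mu_star) * (Z @ nu_star) + 0.1 * rng.normal(size=m)
    mu, nu = prox_linear_step(mu, nu, X, Z, y)
print("final mini-batch RMS residual:",
      np.sqrt(np.mean(((X @ mu) * (Z @ nu) - y) ** 2)))
```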
The Power of Learned Locally Linear Models for Nonlinear Policy Optimization
Pfrommer, Daniel, Simchowitz, Max, Westenbroek, Tyler, Matni, Nikolai, Tu, Stephen
A common pipeline in learning-based control is to iteratively estimate a model of system dynamics and apply a trajectory optimization algorithm, e.g.\ $\mathtt{iLQR}$, to the learned model to minimize a target cost. This paper conducts a rigorous analysis of a simplified variant of this strategy for general nonlinear systems. We analyze an algorithm which iterates between estimating local linear models of nonlinear system dynamics and performing $\mathtt{iLQR}$-like policy updates. We demonstrate that this algorithm attains sample complexity polynomial in relevant problem parameters and, by synthesizing locally stabilizing gains, overcomes exponential dependence on the problem horizon. Experimental results validate the performance of our algorithm and compare it to natural deep-learning baselines.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > Massachusetts (0.04)
- (3 more...)
- Research Report (0.63)
- Workflow (0.45)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
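The loop the abstract analyzes, estimate a local linear model and then synthesize stabilizing gains, can be illustrated with its two core primitives: a least-squares fit of $(A, B)$ from observed transitions and the backward Riccati recursion that underlies $\mathtt{iLQR}$. The toy dynamics, costs, and dimensions below are assumptions for illustration, not the paper's algorithm.

```python
# Minimal sketch: (1) fit x' ~ A x + B u by least squares from transition
# data, (2) run a finite-horizon LQR backward pass on the fitted model to
# synthesize a locally stabilizing feedback gain K.
import numpy as np

def fit_linear_model(states, inputs, next_states):
    """Least-squares fit of local linear dynamics from transitions."""
    XU = np.hstack([states, inputs])
    theta, *_ = np.linalg.lstsq(XU, next_states, rcond=None)
    n = states.shape[1]
    return theta[:n].T, theta[n:].T        # A (n x n), B (n x m)

def lqr_gain(A, B, Q, R, horizon=50):
    """Backward Riccati recursion; returns the time-0 feedback gain K."""
    P = Q.copy()
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Toy transitions from mildly nonlinear dynamics; the linear fit is local.
rng = np.random.default_rng(0)
n, m, N = 4, 2, 500
A_true = np.eye(n) + 0.1 * rng.normal(size=(n, n))
B_true = rng.normal(size=(n, m))
x = rng.normal(size=(N, n))
u = rng.normal(size=(N, m))
x_next = x @ A_true.T + u @ B_true.T + 0.01 * np.sin(x)

A_hat, B_hat = fit_linear_model(x, u, x_next)
K = lqr_gain(A_hat, B_hat, Q=np.eye(n), R=np.eye(m))
print("closed-loop spectral radius:",
      max(abs(np.linalg.eigvals(A_hat - B_hat @ K))))
```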